51 research outputs found

    Segmenting broadcast news streams using lexical chains

    In this paper we propose a coarse-grained NLP approach to text segmentation based on the analysis of lexical cohesion within text. Most work in this area has focused on the discovery of textual units that discuss subtopic structure within documents. In contrast, our segmentation task requires the discovery of topical units of text, i.e. distinct news stories from broadcast news programmes. Our system, SeLeCT, first builds a set of lexical chains in order to model the discourse structure of the text. A boundary detector is then used to search for breaking points in this structure, indicated by patterns of cohesive strength and weakness within the text. We evaluate this technique on a test set of concatenated CNN news story transcripts and compare it with an established statistical approach to segmentation called TextTiling.

    SeLeCT: a lexical cohesion based news story segmentation system

    In this paper we compare the performance of three distinct approaches to lexical cohesion based text segmentation. Most work in this area has focused on the discovery of textual units that discuss subtopic structure within documents. In contrast, our segmentation task requires the discovery of topical units of text, i.e. distinct news stories from broadcast news programmes. Our approach to news story segmentation (the SeLeCT system) is based on an analysis of lexical cohesive strength between textual units using a linguistic technique called lexical chaining. We evaluate the performance of SeLeCT relative to two other cohesion based segmenters: TextTiling and C99. Using a recently introduced evaluation metric, WindowDiff, we contrast the segmentation accuracy of each system on both "spoken" (CNN news transcripts) and "written" (Reuters newswire) news story test sets extracted from the TDT1 corpus.
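
The WindowDiff metric named in the abstract above can be illustrated with a short sketch. This is not the authors' evaluation code. It assumes each segmentation is encoded as a list of 0/1 flags, one per textual unit, where 1 marks a story boundary after that unit, and it counts `len - k + 1` sliding windows (a common convention for boundary strings). The function name `window_diff` and the choice of `k` are illustrative assumptions.

```python
def window_diff(reference, hypothesis, k):
    """Fraction of length-k windows in which the two segmentations
    disagree on the number of boundaries (0.0 = identical, 1.0 = worst)."""
    if len(reference) != len(hypothesis):
        raise ValueError("segmentations must cover the same number of units")
    n = len(reference)
    if not 0 < k <= n:
        raise ValueError("window size k must be between 1 and the sequence length")

    windows = n - k + 1
    errors = 0
    for i in range(windows):
        # Compare boundary counts inside the same window of both segmentations.
        if sum(reference[i:i + k]) != sum(hypothesis[i:i + k]):
            errors += 1
    return errors / windows


# Hypothetical example: two true boundaries, one of them placed one unit early.
reference = [0, 0, 1, 0, 0, 1, 0, 0]
near_miss = [0, 1, 0, 0, 0, 1, 0, 0]
score = window_diff(reference, near_miss, k=3)
```

A near miss is penalised only in the windows that contain one boundary but not the other, so it scores much better than a segmenter that misses the boundary entirely. In practice `k` is often set to about half the average reference segment length.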

    My Own Venetian Rose

    https://digitalcommons.library.umaine.edu/mmb-vp/2220/thumbnail.jp

    An analysis of interactions within and between extreme right communities in social media

    Many extreme right groups have had an online presence for some time through the use of dedicated websites. This has been accompanied by increased activity on social media websites in recent years, which may enable the dissemination of extreme right content to a wider audience. In this paper, we present an exploratory analysis of the activity of a selection of such groups on Twitter, using network representations based on reciprocal follower and mentions interactions. We find that stable communities of related users are present within individual country networks, where these communities are usually associated with variants of extreme right ideology. Furthermore, we identify the presence of international relationships between certain groups across geopolitical boundaries.
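
As a rough illustration of what a network built from reciprocal interactions might look like, the sketch below keeps an undirected edge between two accounts only when a directed interaction exists in both directions. It is an assumption about the general idea, not the authors' pipeline. The function name `reciprocal_edges` and the account names are hypothetical.

```python
def reciprocal_edges(directed_edges):
    """Return the undirected edges (u, v) for which both u -> v and v -> u
    appear in the input. Self-loops are ignored."""
    edge_set = set(directed_edges)
    return {
        tuple(sorted((u, v)))
        for u, v in edge_set
        if u != v and (v, u) in edge_set
    }


# Hypothetical follower relations: (follower, followed).
follows = [
    ("a", "b"), ("b", "a"),   # mutual, kept
    ("a", "c"),               # one-directional, dropped
    ("c", "d"), ("d", "c"),   # mutual, kept
    ("e", "e"),               # self-loop, dropped
]
mutual = reciprocal_edges(follows)
```

The same filter applies to mentions by passing (mentioner, mentioned) pairs. The resulting undirected edge set can then be handed to any community detection routine.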

    Uncovering the Wider Structure of Extreme Right Communities Spanning Popular Online Networks

    Recent years have seen increased interest in the online presence of extreme right groups. Although originally composed of dedicated websites, the online extreme right milieu now spans multiple networks, including popular social media platforms such as Twitter, Facebook and YouTube. Ideally, therefore, any contemporary analysis of online extreme right activity requires the consideration of multiple data sources, rather than being restricted to a single platform. We investigate the potential for Twitter to act as a gateway to communities within the wider online network of the extreme right, given its facility for the dissemination of content. A strategy for representing heterogeneous network data with a single homogeneous network for the purpose of community detection is presented, where these inherently dynamic communities are tracked over time. We use this strategy to discover and analyze persistent English and German language extreme right communities.
    Comment: 10 pages, 11 figures. Due to use of "sigchi" template, minor changes were made to ensure 10 page limit was not exceeded. Minor clarifications in Introduction, Data and Methodology section.

    There's Only One that I Would Lose My Sleep for, (and That's for Daddy)

    https://digitalcommons.library.umaine.edu/mmb-vp/6373/thumbnail.jp